15 research outputs found

Automated Generation of Cross-Domain Analogies via Evolutionary Computation

Author: Baydin Atilim Gunes
de Mantaras Ramon Lopez
Ontanon Santiago
Publication date: 01/01/2012

Analogy plays an important role in creativity, and is extensively used in science as well as art. In this paper we introduce a technique for the automated generation of cross-domain analogies based on a novel evolutionary algorithm (EA). Unlike existing work in computational analogy-making restricted to creating analogies between two given cases, our approach, for a given case, is capable of creating an analogy along with the novel analogous case itself. Our algorithm is based on the concept of "memes", which are units of culture, or knowledge, undergoing variation and selection under a fitness measure, and represents evolving pieces of knowledge as semantic networks. Using a fitness function based on Gentner's structure mapping theory of analogies, we demonstrate the feasibility of spontaneously generating semantic networks that are analogous to a given base network. Comment: Conference submission, International Conference on Computational Creativity 2012 (8 pages, 6 figures

arXiv.org e-Print Archive

CiteSeerX

Western Sydney ResearchDirect

FNet: Mixing Tokens with Fourier Transforms

Author: Ainslie Joshua
Eckstein Ilya
Lee-Thorp James
Ontanon Santiago
Publication date: 18/06/2021

We show that Transformer encoder architectures can be massively sped up, with limited accuracy costs, by replacing the self-attention sublayers with simple linear transformations that "mix" input tokens. These linear transformations, along with standard nonlinearities in feed-forward layers, prove competent at modeling semantic relationships in several text classification tasks. Most surprisingly, we find that replacing the self-attention sublayer in a Transformer encoder with a standard, unparameterized Fourier Transform achieves 92-97% of the accuracy of BERT counterparts on the GLUE benchmark, but trains nearly seven times faster on GPUs and twice as fast on TPUs. The resulting model, FNet, also scales very efficiently to long inputs. Specifically, when compared to the "efficient" Transformers on the Long Range Arena benchmark, FNet matches the accuracy of the most accurate models, but is faster than the fastest models across all sequence lengths on GPUs (and across relatively shorter lengths on TPUs). Finally, FNet has a light memory footprint and is particularly efficient at smaller model sizes: for a fixed speed and accuracy budget, small FNet models outperform Transformer counterparts.

arXiv.org e-Print Archive
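The token mixing described in the FNet abstract above can be illustrated with a minimal sketch. It assumes the mixing step applies a discrete Fourier transform along the hidden dimension, then along the sequence dimension, and keeps only the real part; the abstract itself only says "a standard, unparameterized Fourier Transform", so this layout is an assumption. The function names `dft` and `fourier_mix` are hypothetical, and the naive O(n^2) transform is for illustration only, not a fast implementation.

```python
import cmath


def dft(vec):
    """Naive O(n^2) discrete Fourier transform of a 1D sequence."""
    n = len(vec)
    return [
        sum(vec[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def fourier_mix(x):
    """Mix tokens with a parameter-free 2D DFT (assumed layout, see lead-in).

    x is a list of seq_len rows, each a list of hidden_dim floats.
    Returns a matrix of the same shape holding the real part of the
    transform.
    """
    seq_len, hidden_dim = len(x), len(x[0])
    # Step 1: transform each token vector along the hidden dimension.
    rows = [dft(row) for row in x]
    # Step 2: transform each hidden channel along the sequence dimension.
    cols = [dft([rows[t][h] for t in range(seq_len)]) for h in range(hidden_dim)]
    # Step 3: transpose back to (seq_len, hidden_dim) and keep the real part.
    return [[cols[h][t].real for h in range(hidden_dim)] for t in range(seq_len)]


mixed = fourier_mix([[1.0, 2.0], [3.0, 4.0]])
```

There are no learned weights in this step, which is consistent with the abstract's description of the transform as unparameterized: every output entry depends on every input entry, so tokens are mixed without any attention scores.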

LongT5: Efficient Text-To-Text Transformer for Long Sequences

Author: Ainslie Joshua
Guo Mandy
Ni Jianmo
Ontanon Santiago
Sung Yun-Hsuan
Uthus David
Yang Yinfei
Publication date: 03/05/2022

Recent work has shown that either (1) increasing the input length or (2) increasing model size can improve the performance of Transformer-based neural models. In this paper, we present a new model, called LongT5, with which we explore the effects of scaling both the input length and model size at the same time. Specifically, we integrated attention ideas from long-input transformers (ETC), and adopted pre-training strategies from summarization pre-training (PEGASUS) into the scalable T5 architecture. The result is a new attention mechanism we call Transient Global (TGlobal), which mimics ETC's local/global attention mechanism, but without requiring additional side-inputs. We are able to achieve state-of-the-art results on several summarization tasks and outperform the original T5 models on question answering tasks. Comment: Accepted in NAACL 202

arXiv.org e-Print Archive

Functional Interpolation for Relative Positions Improves Long Context Transformers

Author: Ainslie Joshua
Bhojanapalli Srinadh
Guruganesh Guru
Kumar Sanjiv
Li Shanda
Ontanon Santiago
Sanghai Sumit
Yang Yiming
You Chong
Zaheer Manzil
Publication date: 06/10/2023

Preventing the performance decay of Transformers on inputs longer than those used for training has been an important challenge in extending the context length of these models. Though the Transformer architecture has fundamentally no limits on the input sequence lengths it can process, the choice of position encoding used during training can limit the performance of these models on longer inputs. We propose a novel functional relative position encoding with progressive interpolation, FIRE, to improve Transformer generalization to longer contexts. We theoretically prove that this can represent some of the popular relative position encodings, such as T5's RPE, Alibi, and Kerple. We next empirically show that FIRE models have better generalization to longer contexts on both zero-shot language modeling and long text benchmarks.

arXiv.org e-Print Archive

Generating Maps Using Markov Chains

Author: Ontanon Santiago
Snodgrass Sam
Publication venue: Association for the Advancement of Artificial Intelligence
Publication date: 30/06/2021

In this paper we outline a method of procedurally generating maps using Markov Chains. Our method attempts to learn what makes a "good" map from a set of given human-authored maps, and then uses those learned patterns to generate new maps. We present an empirical evaluation using the game "Super Mario Bros.," showing encouraging results.

Association for the Advancement of Artificial Intelligence: AAAI Publications
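The learn-then-generate idea in the abstract above can be sketched in a simplified form. This is not the paper's method: it assumes a first-order chain in which each tile depends only on the tile to its left, and the tile symbols, the toy training map, and the function names `learn_transitions` and `generate_row` are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def learn_transitions(maps):
    """Count how often each tile follows its left neighbour in the given maps.

    maps is a list of levels; each level is a list of equal-length strings,
    one character per tile.
    """
    counts = defaultdict(Counter)
    for level in maps:
        for row in level:
            for prev, cur in zip(row, row[1:]):
                counts[prev][cur] += 1
    return counts


def generate_row(counts, start, length, rng):
    """Sample a row left to right, weighting each tile by its learned count."""
    row = [start]
    while len(row) < length:
        options = counts.get(row[-1])
        if not options:
            break  # no observed successor for this tile; stop early
        tiles, weights = zip(*options.items())
        row.append(rng.choices(tiles, weights=weights)[0])
    return "".join(row)


# Toy "human-authored" map: '-' is empty space, '#' is solid ground.
training_maps = [["--#-", "####"]]
counts = learn_transitions(training_maps)
new_row = generate_row(counts, start="#", length=8, rng=random.Random(0))
```

Conditioning on more neighbours (for example the tile below as well as the tile to the left) would capture vertical structure, at the cost of needing more training maps to estimate the larger transition table.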

Story Representation In Analogy-Based Story Generation In Riu

Author: Ontanon Santiago
Zhu Jichen
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/12/2010

Computational analogy offers a promising direction to algorithmically generating stories, a key challenge in computational narrative. Since analogy methods are very sensitive to the story representation being used, this paper focuses on story representation for analogy-based story generation. Specifically, we analyze existing story representation formalisms and propose a new approach based on the cognitive semantics theory of force dynamics. Finally, we present the results of our analogy-based interactive narrative system, Riu, to illustrate the utility of our proposal. © 2010 IEEE

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)